Exploratory Data Analysis on Zomato Dataset¶

Importing Necessary Libraries¶

In [1]:
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
import plotly.express as px
import folium
In [55]:
# openpyxl is the engine pandas uses to read and write Excel files
%pip install openpyxl
Note: you may need to restart the kernel to use updated packages.
In [3]:
pd.set_option('display.max_columns',None)    
In [4]:
#importing zomato dataset csv file
df = pd.read_csv(r"C:\Users\lenovo\Downloads\Zomato\zomato.csv", encoding="ISO-8859-1")
In [5]:
# Import the country-code lookup table (xlsx file)
df2 = pd.read_excel(r"C:\Users\lenovo\Downloads\Zomato\Country-Code.xlsx")

Understanding the data¶

In [6]:
df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 9551 entries, 0 to 9550
Data columns (total 21 columns):
 #   Column                Non-Null Count  Dtype  
---  ------                --------------  -----  
 0   Restaurant ID         9551 non-null   int64  
 1   Restaurant Name       9551 non-null   object 
 2   Country Code          9551 non-null   int64  
 3   City                  9551 non-null   object 
 4   Address               9551 non-null   object 
 5   Locality              9551 non-null   object 
 6   Locality Verbose      9551 non-null   object 
 7   Longitude             9551 non-null   float64
 8   Latitude              9551 non-null   float64
 9   Cuisines              9542 non-null   object 
 10  Average Cost for two  9551 non-null   int64  
 11  Currency              9551 non-null   object 
 12  Has Table booking     9551 non-null   object 
 13  Has Online delivery   9551 non-null   object 
 14  Is delivering now     9551 non-null   object 
 15  Switch to order menu  9551 non-null   object 
 16  Price range           9551 non-null   int64  
 17  Aggregate rating      9551 non-null   float64
 18  Rating color          9551 non-null   object 
 19  Rating text           9551 non-null   object 
 20  Votes                 9551 non-null   int64  
dtypes: float64(3), int64(5), object(13)
memory usage: 1.5+ MB

Descriptive Statistics:¶

In [7]:
df.describe()
Out[7]:
Restaurant ID Country Code Longitude Latitude Average Cost for two Price range Aggregate rating Votes
count 9.551000e+03 9551.000000 9551.000000 9551.000000 9551.000000 9551.000000 9551.000000 9551.000000
mean 9.051128e+06 18.365616 64.126574 25.854381 1199.210763 1.804837 2.666370 156.909748
std 8.791521e+06 56.750546 41.467058 11.007935 16121.183073 0.905609 1.516378 430.169145
min 5.300000e+01 1.000000 -157.948486 -41.330428 0.000000 1.000000 0.000000 0.000000
25% 3.019625e+05 1.000000 77.081343 28.478713 250.000000 1.000000 2.500000 5.000000
50% 6.004089e+06 1.000000 77.191964 28.570469 400.000000 2.000000 3.200000 31.000000
75% 1.835229e+07 1.000000 77.282006 28.642758 700.000000 2.000000 3.700000 131.000000
max 1.850065e+07 216.000000 174.832089 55.976980 800000.000000 4.000000 4.900000 10934.000000
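The summary above pools `Average Cost for two` across every currency in the file (see the `Currency` column in `df.info()`), so the overall mean and maximum are hard to interpret. A minimal sketch of summarising cost within each currency instead; the small frame below is a made-up stand-in for `df`, and its values are illustrative only.

```python
import pandas as pd

# Hypothetical stand-in for df; the values are illustrative, not taken from the dataset
sample = pd.DataFrame({
    "Currency": ["Indian Rupees(Rs.)", "Indian Rupees(Rs.)",
                 "Turkish Lira(TL)", "Turkish Lira(TL)"],
    "Average Cost for two": [400, 800, 80, 120],
})

# Summarise cost separately for each currency so the numbers are comparable
cost_by_currency = (
    sample.groupby("Currency")["Average Cost for two"]
          .agg(["count", "median", "max"])
)
print(cost_by_currency)
```

On the real data the same call would be made on `df` rather than `sample`.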
In [8]:
df.isnull().sum()
Out[8]:
Restaurant ID           0
Restaurant Name         0
Country Code            0
City                    0
Address                 0
Locality                0
Locality Verbose        0
Longitude               0
Latitude                0
Cuisines                9
Average Cost for two    0
Currency                0
Has Table booking       0
Has Online delivery     0
Is delivering now       0
Switch to order menu    0
Price range             0
Aggregate rating        0
Rating color            0
Rating text             0
Votes                   0
dtype: int64
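Only `Cuisines` has missing values (9 rows). Two possible ways to handle them are sketched below on a made-up frame: label them explicitly, or drop them. The placeholder label `'Unknown'` is an assumption, not something the original notebook uses.

```python
import numpy as np
import pandas as pd

# Made-up rows for illustration; one restaurant has no cuisine recorded
sample = pd.DataFrame({
    "Restaurant Name": ["A", "B", "C"],
    "Cuisines": ["Cafe", np.nan, "Italian, Pizza"],
})

# Option 1: keep the rows and mark the missing cuisine explicitly
filled = sample.assign(Cuisines=sample["Cuisines"].fillna("Unknown"))

# Option 2: drop only the rows where Cuisines is missing
dropped = sample.dropna(subset=["Cuisines"])

print(filled["Cuisines"].isnull().sum(), len(dropped))
```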
In [9]:
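# drop_duplicates() returns a new DataFrame and leaves df unchanged;
# the result still has all 9551 rows, so the data contains no duplicate rows
df.drop_duplicates()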
Out[9]:
Restaurant ID Restaurant Name Country Code City Address Locality Locality Verbose Longitude Latitude Cuisines Average Cost for two Currency Has Table booking Has Online delivery Is delivering now Switch to order menu Price range Aggregate rating Rating color Rating text Votes
0 6317637 Le Petit Souffle 162 Makati City Third Floor, Century City Mall, Kalayaan Avenu... Century City Mall, Poblacion, Makati City Century City Mall, Poblacion, Makati City, Mak... 121.027535 14.565443 French, Japanese, Desserts 1100 Botswana Pula(P) Yes No No No 3 4.8 Dark Green Excellent 314
1 6304287 Izakaya Kikufuji 162 Makati City Little Tokyo, 2277 Chino Roces Avenue, Legaspi... Little Tokyo, Legaspi Village, Makati City Little Tokyo, Legaspi Village, Makati City, Ma... 121.014101 14.553708 Japanese 1200 Botswana Pula(P) Yes No No No 3 4.5 Dark Green Excellent 591
2 6300002 Heat - Edsa Shangri-La 162 Mandaluyong City Edsa Shangri-La, 1 Garden Way, Ortigas, Mandal... Edsa Shangri-La, Ortigas, Mandaluyong City Edsa Shangri-La, Ortigas, Mandaluyong City, Ma... 121.056831 14.581404 Seafood, Asian, Filipino, Indian 4000 Botswana Pula(P) Yes No No No 4 4.4 Green Very Good 270
3 6318506 Ooma 162 Mandaluyong City Third Floor, Mega Fashion Hall, SM Megamall, O... SM Megamall, Ortigas, Mandaluyong City SM Megamall, Ortigas, Mandaluyong City, Mandal... 121.056475 14.585318 Japanese, Sushi 1500 Botswana Pula(P) No No No No 4 4.9 Dark Green Excellent 365
4 6314302 Sambo Kojin 162 Mandaluyong City Third Floor, Mega Atrium, SM Megamall, Ortigas... SM Megamall, Ortigas, Mandaluyong City SM Megamall, Ortigas, Mandaluyong City, Mandal... 121.057508 14.584450 Japanese, Korean 1500 Botswana Pula(P) Yes No No No 4 4.8 Dark Green Excellent 229
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
9546 5915730 NamlÛ± Gurme 208 ÛÁstanbul Kemankeô Karamustafa Paôa Mahallesi, RÛ±htÛ±... Karakí_y Karakí_y, ÛÁstanbul 28.977392 41.022793 Turkish 80 Turkish Lira(TL) No No No No 3 4.1 Green Very Good 788
9547 5908749 Ceviz AÛôacÛ± 208 ÛÁstanbul Koôuyolu Mahallesi, Muhittin íìstí_ndaÛô Cadd... Koôuyolu Koôuyolu, ÛÁstanbul 29.041297 41.009847 World Cuisine, Patisserie, Cafe 105 Turkish Lira(TL) No No No No 3 4.2 Green Very Good 1034
9548 5915807 Huqqa 208 ÛÁstanbul Kuruí_eôme Mahallesi, Muallim Naci Caddesi, N... Kuruí_eôme Kuruí_eôme, ÛÁstanbul 29.034640 41.055817 Italian, World Cuisine 170 Turkish Lira(TL) No No No No 4 3.7 Yellow Good 661
9549 5916112 Aôôk Kahve 208 ÛÁstanbul Kuruí_eôme Mahallesi, Muallim Naci Caddesi, N... Kuruí_eôme Kuruí_eôme, ÛÁstanbul 29.036019 41.057979 Restaurant Cafe 120 Turkish Lira(TL) No No No No 4 4.0 Green Very Good 901
9550 5927402 Walter's Coffee Roastery 208 ÛÁstanbul CafeaÛôa Mahallesi, BademaltÛ± Sokak, No 21/B,... Moda Moda, ÛÁstanbul 29.026016 40.984776 Cafe 55 Turkish Lira(TL) No No No No 2 4.0 Green Very Good 591

9551 rows × 21 columns

In [10]:
df2.head()
Out[10]:
Country Code Country
0 1 India
1 14 Australia
2 30 Brazil
3 37 Canada
4 94 Indonesia

Merging the Two Datasets into One¶

In [11]:
df = pd.merge(df,df2, on = 'Country Code', how = 'left')
In [12]:
df.head()
Out[12]:
Restaurant ID Restaurant Name Country Code City Address Locality Locality Verbose Longitude Latitude Cuisines Average Cost for two Currency Has Table booking Has Online delivery Is delivering now Switch to order menu Price range Aggregate rating Rating color Rating text Votes Country
0 6317637 Le Petit Souffle 162 Makati City Third Floor, Century City Mall, Kalayaan Avenu... Century City Mall, Poblacion, Makati City Century City Mall, Poblacion, Makati City, Mak... 121.027535 14.565443 French, Japanese, Desserts 1100 Botswana Pula(P) Yes No No No 3 4.8 Dark Green Excellent 314 Phillipines
1 6304287 Izakaya Kikufuji 162 Makati City Little Tokyo, 2277 Chino Roces Avenue, Legaspi... Little Tokyo, Legaspi Village, Makati City Little Tokyo, Legaspi Village, Makati City, Ma... 121.014101 14.553708 Japanese 1200 Botswana Pula(P) Yes No No No 3 4.5 Dark Green Excellent 591 Phillipines
2 6300002 Heat - Edsa Shangri-La 162 Mandaluyong City Edsa Shangri-La, 1 Garden Way, Ortigas, Mandal... Edsa Shangri-La, Ortigas, Mandaluyong City Edsa Shangri-La, Ortigas, Mandaluyong City, Ma... 121.056831 14.581404 Seafood, Asian, Filipino, Indian 4000 Botswana Pula(P) Yes No No No 4 4.4 Green Very Good 270 Phillipines
3 6318506 Ooma 162 Mandaluyong City Third Floor, Mega Fashion Hall, SM Megamall, O... SM Megamall, Ortigas, Mandaluyong City SM Megamall, Ortigas, Mandaluyong City, Mandal... 121.056475 14.585318 Japanese, Sushi 1500 Botswana Pula(P) No No No No 4 4.9 Dark Green Excellent 365 Phillipines
4 6314302 Sambo Kojin 162 Mandaluyong City Third Floor, Mega Atrium, SM Megamall, Ortigas... SM Megamall, Ortigas, Mandaluyong City SM Megamall, Ortigas, Mandaluyong City, Mandal... 121.057508 14.584450 Japanese, Korean 1500 Botswana Pula(P) Yes No No No 4 4.8 Dark Green Excellent 229 Phillipines
In [13]:
df.shape
Out[13]:
(9551, 22)
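The shape confirms that the left merge kept all 9551 rows and added one column. A hedged sketch of how a merge like this could check itself: `validate='many_to_one'` raises an error if a country code appears more than once in the lookup table, and `indicator=True` flags restaurants whose code has no match. The two small frames below are made up for illustration.

```python
import pandas as pd

# Made-up stand-ins for df and df2
restaurants = pd.DataFrame({"Restaurant ID": [1, 2, 3], "Country Code": [1, 1, 999]})
countries = pd.DataFrame({"Country Code": [1, 14], "Country": ["India", "Australia"]})

merged = pd.merge(
    restaurants, countries,
    on="Country Code", how="left",
    validate="many_to_one",   # each code must be unique in `countries`
    indicator=True,           # adds a `_merge` column: 'both' or 'left_only'
)

# Rows whose country code was not found in the lookup table
unmatched = merged[merged["_merge"] == "left_only"]
print(len(merged), len(unmatched))
```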
In [14]:
df.columns
Out[14]:
Index(['Restaurant ID', 'Restaurant Name', 'Country Code', 'City', 'Address',
       'Locality', 'Locality Verbose', 'Longitude', 'Latitude', 'Cuisines',
       'Average Cost for two', 'Currency', 'Has Table booking',
       'Has Online delivery', 'Is delivering now', 'Switch to order menu',
       'Price range', 'Aggregate rating', 'Rating color', 'Rating text',
       'Votes', 'Country'],
      dtype='object')

Initial Data Exploration¶

In [15]:
values = df.Country.value_counts().values
In [16]:
index = df.Country.value_counts().index
In [17]:
plt.figure(figsize  = (12,6))
plt.pie(values[0:3], labels = index[0:3]);
In [19]:
ratings = df.groupby(['Aggregate rating','Rating color','Rating text']).size().reset_index(name='Rating Count')
In [20]:
ratings
Out[20]:
Aggregate rating Rating color Rating text Rating Count
0 0.0 White Not rated 2148
1 1.8 Red Poor 1
2 1.9 Red Poor 2
3 2.0 Red Poor 7
4 2.1 Red Poor 15
5 2.2 Red Poor 27
6 2.3 Red Poor 47
7 2.4 Red Poor 87
8 2.5 Orange Average 110
9 2.6 Orange Average 191
10 2.7 Orange Average 250
11 2.8 Orange Average 315
12 2.9 Orange Average 381
13 3.0 Orange Average 468
14 3.1 Orange Average 519
15 3.2 Orange Average 522
16 3.3 Orange Average 483
17 3.4 Orange Average 498
18 3.5 Yellow Good 480
19 3.6 Yellow Good 458
20 3.7 Yellow Good 427
21 3.8 Yellow Good 400
22 3.9 Yellow Good 335
23 4.0 Green Very Good 266
24 4.1 Green Very Good 274
25 4.2 Green Very Good 221
26 4.3 Green Very Good 174
27 4.4 Green Very Good 144
28 4.5 Dark Green Excellent 95
29 4.6 Dark Green Excellent 78
30 4.7 Dark Green Excellent 42
31 4.8 Dark Green Excellent 25
32 4.9 Dark Green Excellent 61
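The table shows that each rating text (and colour) corresponds to a contiguous band of `Aggregate rating`. The sketch below reproduces those bands with `pd.cut`, using the boundaries read from the table. Ratings between 0 and 1.8 do not occur in the table, so assigning them to 'Poor' is an assumption.

```python
import pandas as pd

# One rating from each edge of the bands shown in the table
ratings_sample = pd.Series([0.0, 2.4, 2.5, 3.4, 3.5, 3.9, 4.0, 4.4, 4.5, 4.9])

# Right-closed bins: (-0.1, 0] -> Not rated, (0, 2.4] -> Poor, (2.4, 3.4] -> Average, ...
bins = [-0.1, 0.0, 2.4, 3.4, 3.9, 4.4, 4.9]
labels = ["Not rated", "Poor", "Average", "Good", "Very Good", "Excellent"]

rating_text = pd.cut(ratings_sample, bins=bins, labels=labels)
print(rating_text.tolist())
```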
In [21]:
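# Count restaurants per rating colour. Using data=ratings here would only count
# how many distinct rating values fall under each colour, not how many restaurants.
sns.countplot(x='Rating color', data=df)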
Out[21]:
<Axes: xlabel='Rating color', ylabel='count'>

Very Good (Green) ratings distribution among different countries¶

In [22]:
df[df['Rating color']=='Green'].groupby('Country').size().sort_values(ascending = False).plot(kind = 'bar')
Out[22]:
<Axes: xlabel='Country'>

No Ratings distribution among different countries¶

In [23]:
df[df['Rating color']=='White'].groupby('Country').size().sort_values(ascending = False).plot(kind= 'bar')
Out[23]:
<Axes: xlabel='Country'>

Bad Ratings distribution among different countries¶

In [24]:
df[df['Rating color']=='Red'].groupby('Country').size().sort_values(ascending = False).plot(kind= 'bar')
Out[24]:
<Axes: xlabel='Country'>

Excellent Ratings distribution among different countries¶

In [25]:
df[df['Rating color']=='Dark Green'].groupby('Country').size().sort_values(ascending=False).plot(kind='bar')
Out[25]:
<Axes: xlabel='Country'>
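The four bar charts above plot raw counts, which India dominates because it supplies most of the rows. One alternative view, sketched below, is the share of each rating text within each country, computed with `pd.crosstab(..., normalize='index')`. The frame is made up for illustration.

```python
import pandas as pd

# Made-up rows; the real notebook would pass df["Country"] and df["Rating text"]
sample = pd.DataFrame({
    "Country": ["India", "India", "India", "India", "Australia", "Australia"],
    "Rating text": ["Average", "Average", "Good", "Not rated", "Very Good", "Good"],
})

# Each row sums to 1, so countries of very different sizes can be compared
rating_share = pd.crosstab(sample["Country"], sample["Rating text"], normalize="index")
print(rating_share.round(2))
```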

Top 10 Cuisines around the world¶

In [26]:
df.Cuisines.value_counts().head(10).plot(kind= 'pie')
Out[26]:
<Axes: ylabel='Cuisines'>
In [27]:
df.Cuisines.value_counts().head(20).plot(kind= 'bar')
Out[27]:
<Axes: >

The previous analysis shows that most of the data comes from India (8652 of 9551 restaurants), so the rest of the analysis focuses on India.¶

In [28]:
india = df[df['Country']=='India']
india
Out[28]:
Restaurant ID Restaurant Name Country Code City Address Locality Locality Verbose Longitude Latitude Cuisines Average Cost for two Currency Has Table booking Has Online delivery Is delivering now Switch to order menu Price range Aggregate rating Rating color Rating text Votes Country
624 3400025 Jahanpanah 1 Agra E 23, Shopping Arcade, Sadar Bazaar, Agra Cant... Agra Cantt Agra Cantt, Agra 78.011544 27.161661 North Indian, Mughlai 850 Indian Rupees(Rs.) No No No No 3 3.9 Yellow Good 140 India
625 3400341 Rangrezz Restaurant 1 Agra E-20, Shopping Arcade, Sadar Bazaar, Agra Cant... Agra Cantt Agra Cantt, Agra 0.000000 0.000000 North Indian, Mughlai 700 Indian Rupees(Rs.) No No No No 2 3.5 Yellow Good 71 India
626 3400005 Time2Eat - Mama Chicken 1 Agra Main Market, Sadar Bazaar, Agra Cantt, Agra Agra Cantt Agra Cantt, Agra 78.011608 27.160832 North Indian 500 Indian Rupees(Rs.) No No No No 2 3.6 Yellow Good 94 India
627 3400021 Chokho Jeeman Marwari Jain Bhojanalya 1 Agra 1/48, Delhi Gate, Station Road, Raja Mandi, Ci... Civil Lines Civil Lines, Agra 77.998092 27.195928 Rajasthani 400 Indian Rupees(Rs.) No No No No 2 4.0 Green Very Good 87 India
628 3400017 Pinch Of Spice 1 Agra 23/453, Opposite Sanjay Cinema, Wazipura Road,... Civil Lines Civil Lines, Agra 78.007553 27.201725 North Indian, Chinese, Mughlai 1000 Indian Rupees(Rs.) No No No No 3 4.2 Green Very Good 177 India
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
9271 2800100 D Cabana 1 Vizag Beach Road, Near Bus Stop, Sagar Nagar, Visakh... Sagar Nagar Sagar Nagar, Vizag 83.361377 17.764287 Continental, Seafood, Chinese, North Indian, B... 600 Indian Rupees(Rs.) No No No No 2 3.6 Yellow Good 193 India
9272 2800418 Kaloreez 1 Vizag Plot 95, Opposite St. Lukes Nursing School, Da... Siripuram Siripuram, Vizag 0.000000 0.000000 Cafe, North Indian, Chinese 400 Indian Rupees(Rs.) No No No No 2 3.7 Yellow Good 85 India
9273 2800881 Plot 17 1 Vizag Plot 17, Gangapur Layout, Siripuram, Vizag Siripuram Siripuram, Vizag 83.315281 17.719539 Burger, Pizza, Biryani 600 Indian Rupees(Rs.) No No No No 2 4.3 Green Very Good 172 India
9274 2800042 Vista - The Park 1 Vizag The Park, Beach Road, Pedda Waltair, Lawsons B... The Park, Lawsons Bay The Park, Lawsons Bay, Vizag 83.336840 17.721182 American, North Indian, Thai, Continental 1500 Indian Rupees(Rs.) No No No No 4 3.8 Yellow Good 74 India
9275 2800019 Flying Spaghetti Monster 1 Vizag 10-50-12/F2, Sai Dakshata Complex, Beside Leno... Waltair Uplands Waltair Uplands, Vizag 83.314942 17.721119 Italian 1400 Indian Rupees(Rs.) No No No No 3 4.4 Green Very Good 316 India

8652 rows × 22 columns

In [29]:
india.head()
Out[29]:
Restaurant ID Restaurant Name Country Code City Address Locality Locality Verbose Longitude Latitude Cuisines Average Cost for two Currency Has Table booking Has Online delivery Is delivering now Switch to order menu Price range Aggregate rating Rating color Rating text Votes Country
624 3400025 Jahanpanah 1 Agra E 23, Shopping Arcade, Sadar Bazaar, Agra Cant... Agra Cantt Agra Cantt, Agra 78.011544 27.161661 North Indian, Mughlai 850 Indian Rupees(Rs.) No No No No 3 3.9 Yellow Good 140 India
625 3400341 Rangrezz Restaurant 1 Agra E-20, Shopping Arcade, Sadar Bazaar, Agra Cant... Agra Cantt Agra Cantt, Agra 0.000000 0.000000 North Indian, Mughlai 700 Indian Rupees(Rs.) No No No No 2 3.5 Yellow Good 71 India
626 3400005 Time2Eat - Mama Chicken 1 Agra Main Market, Sadar Bazaar, Agra Cantt, Agra Agra Cantt Agra Cantt, Agra 78.011608 27.160832 North Indian 500 Indian Rupees(Rs.) No No No No 2 3.6 Yellow Good 94 India
627 3400021 Chokho Jeeman Marwari Jain Bhojanalya 1 Agra 1/48, Delhi Gate, Station Road, Raja Mandi, Ci... Civil Lines Civil Lines, Agra 77.998092 27.195928 Rajasthani 400 Indian Rupees(Rs.) No No No No 2 4.0 Green Very Good 87 India
628 3400017 Pinch Of Spice 1 Agra 23/453, Opposite Sanjay Cinema, Wazipura Road,... Civil Lines Civil Lines, Agra 78.007553 27.201725 North Indian, Chinese, Mughlai 1000 Indian Rupees(Rs.) No No No No 3 4.2 Green Very Good 177 India

Univariate Analysis:¶

Exploring the distribution of each variable individually.¶

In [30]:
#Plotting histograms for Average Cost for two

plt.figure(figsize=(10, 6))
sns.histplot(india['Average Cost for two'], bins=30, kde=True, color='skyblue')
plt.title('Distribution of Average Cost for Two')
plt.xlabel('Average Cost for Two')
plt.ylabel('Frequency')
plt.show()
In [31]:
#Plotting bar charts for categorical variables 

plt.figure(figsize=(8, 5))
sns.countplot(x='Has Table booking', data=india, palette='pastel')
plt.title('Count of Restaurants with Table Booking')
plt.xlabel('Has Table Booking')
plt.ylabel('Count')
plt.show()

Bivariate Analysis:¶

Explore relationships between pairs of variables.¶

In [32]:
# Scatter plot 

plt.figure(figsize=(10, 6))
sns.scatterplot(x='Average Cost for two', y='Aggregate rating', data=india, color='coral')
plt.title('Scatter Plot: Average Cost vs Aggregate Rating')
plt.xlabel('Average Cost for Two')
plt.ylabel('Aggregate Rating')
plt.show()
In [33]:
# Box plot 

plt.figure(figsize=(12, 8))
sns.boxplot(x='Has Table booking', y='Average Cost for two', data=india, palette='Set3')  # india, not df, so all costs share one currency
plt.title('Box Plot: Average Cost for Two by Table Booking')
plt.xlabel('Has Table Booking')
plt.ylabel('Average Cost for Two')
plt.show()

Categorical Variables Analysis:¶

Analyzing counts and proportions of categorical variables like 'Has Table Booking' or 'Has Online Delivery'.¶

In [34]:
a = india.groupby('City')['Has Table booking'].value_counts().sort_values(ascending=False)

a.unstack().fillna(0).astype(int).sort_values(['No','Yes'],ascending=False).reset_index()
Out[34]:
Has Table booking City No Yes
0 New Delhi 4758 715
1 Noida 968 112
2 Gurgaon 914 204
3 Faridabad 236 15
4 Ghaziabad 22 3
5 Ahmedabad 21 0
6 Amritsar 21 0
7 Bhubaneshwar 21 0
8 Guwahati 21 0
9 Lucknow 21 0
10 Agra 20 0
11 Allahabad 20 0
12 Aurangabad 20 0
13 Bhopal 20 0
14 Coimbatore 20 0
15 Dehradun 20 0
16 Goa 20 0
17 Indore 20 0
18 Jaipur 20 0
19 Kanpur 20 0
20 Kochi 20 0
21 Ludhiana 20 0
22 Mangalore 20 0
23 Mysore 20 0
24 Nagpur 20 0
25 Nashik 20 0
26 Patna 20 0
27 Puducherry 20 0
28 Ranchi 20 0
29 Surat 20 0
30 Vadodara 20 0
31 Varanasi 20 0
32 Vizag 20 0
33 Chandigarh 18 0
34 Bangalore 14 6
35 Kolkata 11 9
36 Mumbai 11 9
37 Hyderabad 8 10
38 Chennai 7 13
39 Pune 7 13
40 Mohali 1 0
41 Panchkula 1 0
42 Secunderabad 0 2
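Most cities outside the Delhi region contribute only about 20 restaurants, so raw counts are hard to compare. A sketch of computing the table-booking rate per city and keeping only cities with enough rows; `min_restaurants` is a hypothetical threshold, and the frame is made up.

```python
import pandas as pd

# Made-up rows standing in for the india DataFrame
sample = pd.DataFrame({
    "City": ["New Delhi"] * 4 + ["Pune"] * 2 + ["Mohali"],
    "Has Table booking": ["No", "No", "No", "Yes", "Yes", "No", "No"],
})

min_restaurants = 2  # hypothetical cut-off for a city to be included

# Turn Yes/No into booleans, then count rows and average the booleans per city
per_city = (
    sample.assign(has_booking=sample["Has Table booking"].eq("Yes"))
          .groupby("City")["has_booking"]
          .agg(restaurants="count", booking_rate="mean")
)
booking_rate = per_city[per_city["restaurants"] >= min_restaurants].sort_values(
    "booking_rate", ascending=False
)
print(booking_rate)
```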
In [35]:
# Note: these counts and proportions use df (all countries), not only the india subset
booking_counts = df['Has Table booking'].value_counts()
booking_proportions = df['Has Table booking'].value_counts(normalize=True)
In [36]:
# Bar plot for counts

plt.figure(figsize=(8, 5))
sns.countplot(x='Has Table booking', data=india, palette='Set2')
plt.title('Count of Restaurants with Table Booking')
plt.xlabel('Has Table Booking')
plt.ylabel('Count')
plt.show()
print("Table Booking Counts:")
print(booking_counts)
print("\nTable Booking Proportions:")
print(booking_proportions)
Table Booking Counts:
No     8393
Yes    1158
Name: Has Table booking, dtype: int64

Table Booking Proportions:
No     0.878756
Yes    0.121244
Name: Has Table booking, dtype: float64
In [37]:
india.groupby('City')['Has Online delivery'].value_counts().sort_values(ascending=False).unstack().fillna(0).astype(int).sort_values(['No','Yes'],ascending=False)
Out[37]:
Has Online delivery No Yes
City
New Delhi 3984 1489
Noida 716 364
Gurgaon 693 425
Faridabad 216 35
Amritsar 21 0
Bhubaneshwar 21 0
Guwahati 21 0
Lucknow 21 0
Agra 20 0
Allahabad 20 0
Aurangabad 20 0
Bhopal 20 0
Dehradun 20 0
Goa 20 0
Indore 20 0
Kanpur 20 0
Ludhiana 20 0
Mangalore 20 0
Mysore 20 0
Nashik 20 0
Patna 20 0
Puducherry 20 0
Ranchi 20 0
Surat 20 0
Vadodara 20 0
Varanasi 20 0
Vizag 20 0
Ghaziabad 15 10
Kochi 15 5
Bangalore 13 7
Coimbatore 13 7
Mumbai 13 7
Pune 13 7
Kolkata 12 8
Chandigarh 12 6
Hyderabad 11 7
Ahmedabad 10 11
Jaipur 10 10
Nagpur 10 10
Chennai 7 13
Secunderabad 1 1
Panchkula 1 0
Mohali 0 1
In [38]:
delivery_counts = india['Has Online delivery'].value_counts()
delivery_proportions = india['Has Online delivery'].value_counts(normalize=True)
In [39]:
plt.figure(figsize=(8, 5))
sns.countplot(x='Has Online delivery', data=india, palette='pastel')
plt.title('Count of Restaurants with Online Delivery')
plt.xlabel('Has Online Delivery')
plt.ylabel('Count')
plt.show()
print("\nOnline Delivery Counts:")
print(delivery_counts)
print("\nOnline Delivery Proportions:")
print(delivery_proportions)
Online Delivery Counts:
No     6229
Yes    2423
Name: Has Online delivery, dtype: int64

Online Delivery Proportions:
No     0.719949
Yes    0.280051
Name: Has Online delivery, dtype: float64

Cuisine Analysis:¶

Exploring the distribution of cuisines.¶

In [40]:
# Split the comma-separated cuisines and list every unique cuisine
cuisine_list = india['Cuisines'].str.split(',').explode().str.strip().unique()
In [41]:
# Counting the occurrences of each cuisine in the dataset
cuisine_counts = india['Cuisines'].str.split(',').explode().str.strip().value_counts()
In [42]:
# Plotting the top N cuisines
top_n = 10 
plt.figure(figsize=(12, 8))
cuisine_counts.head(top_n).sort_values().plot(kind='barh', color='skyblue')  # horizontal bars match the axis labels below
plt.title(f'Top {top_n} Popular Cuisines')
plt.xlabel('Count')
plt.ylabel('Cuisine')
plt.show()

Average Cost Analysis:¶

Restaurants that are expensive¶

In [43]:
costly = india[india['Average Cost for two']>3000].drop_duplicates()
costly
Out[43]:
Restaurant ID Restaurant Name Country Code City Address Locality Locality Verbose Longitude Latitude Cuisines Average Cost for two Currency Has Table booking Has Online delivery Is delivering now Switch to order menu Price range Aggregate rating Rating color Rating text Votes Country
633 3400072 Dawat-e-Nawab - Radisson Blu 1 Agra Radisson Blu, Taj East Gate Road, Tajganj, Agra Radisson Blu, Tajganj Radisson Blu, Tajganj, Agra 78.057044 27.163303 North Indian, Mughlai 3600 Indian Rupees(Rs.) No No No No 4 3.8 Yellow Good 46 India
1216 4358 Cafe G - Crowne Plaza 1 Gurgaon Crowne Plaza, NH-8, Sector 29, Gurgaon Crowne Plaza, Sector 29 Crowne Plaza, Sector 29, Gurgaon 77.060089 28.468433 North Indian, Continental, Chinese 3500 Indian Rupees(Rs.) Yes No No No 4 3.9 Yellow Good 181 India
1217 2443 Wildfire - Crowne Plaza 1 Gurgaon Crowne Plaza, National Highway 8, Sector 29, G... Crowne Plaza, Sector 29 Crowne Plaza, Sector 29, Gurgaon 77.059909 28.468415 South American 5000 Indian Rupees(Rs.) Yes No No No 4 3.7 Yellow Good 131 India
1438 8241 7 Degrees Brauhaus 1 Gurgaon 310 & 311, 3rd Floor, DLF South Point Mall, Go... DLF South Point Mall, Golf Course Road DLF South Point Mall, Golf Course Road, Gurgaon 77.099298 28.448173 Continental, North Indian, European, Finger Food 3200 Indian Rupees(Rs.) Yes No No No 4 4.2 Green Very Good 1193 India
1522 307416 I-Kandy - Le Meridien Gurgaon 1 Gurgaon Le Meridien Gurgaon, Sector 26, Gurgaon Delhi ... Le Meridien Gurgaon, MG Road Le Meridien Gurgaon, MG Road, Gurgaon 77.108727 28.481264 Finger Food 4500 Indian Rupees(Rs.) Yes No No No 4 3.6 Yellow Good 218 India
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
7540 2693 Varq - The Taj Mahal Hotel 1 New Delhi The Taj Mahal Hotel, 1, Mansingh Road, New Delhi The Taj Mahal Hotel, Mansingh Road The Taj Mahal Hotel, Mansingh Road, New Delhi 77.224140 28.605189 Seafood, North Indian 4500 Indian Rupees(Rs.) Yes No No No 4 4.2 Green Very Good 541 India
7542 18376469 Spicy Duck - Taj Palace Hotel 1 New Delhi Taj Palace Hotel, Diplomatic Enclave, Chanakya... The Taj Palace Hotel, Chanakyapuri The Taj Palace Hotel, Chanakyapuri, New Delhi 77.170220 28.594801 Asian 4000 Indian Rupees(Rs.) Yes No No No 4 3.5 Yellow Good 24 India
7543 2701 Orient Express - Taj Palace Hotel 1 New Delhi Taj Palace Hotel, Diplomatic Enclave, Chanakya... The Taj Palace Hotel, Chanakyapuri The Taj Palace Hotel, Chanakyapuri, New Delhi 77.170087 28.595008 European 8000 Indian Rupees(Rs.) Yes No No No 4 4.0 Green Very Good 145 India
8162 8351 Paatra - Jaypee Greens 1 Noida Jaypee Greens Golf & Spa Resort, G Block, Sura... Jaypee Greens Golf & Spa Resort, Surajpur Jaypee Greens Golf & Spa Resort, Surajpur, Noida 77.518139 28.469702 North Indian 3500 Indian Rupees(Rs.) Yes No No No 4 3.5 Yellow Good 79 India
8210 3924 S-18 - Radisson Blu 1 Noida Radisson Blu Hotel, L-2, Sector 18, Noida Radisson Blu, Sector 18, Noida Radisson Blu, Sector 18, Noida, Noida 77.322190 28.568598 Mediterranean, Continental, North Indian, Italian 3200 Indian Rupees(Rs.) Yes No No No 4 3.5 Yellow Good 218 India

76 rows × 22 columns

In [44]:
sns.barplot(x="City", y="Average Cost for two", data=costly,
             color="b")
Out[44]:
<Axes: xlabel='City', ylabel='Average Cost for two'>

Restaurants that are cheap¶

In [45]:
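# Restaurants costing under 800 for two; after the ascending sort, tail(60) keeps the 60 most expensive of them
cheap=india[india['Average Cost for two']<800].sort_values(by='Average Cost for two').drop_duplicates().tail(60)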
In [46]:
figg=px.pie(cheap,values='Average Cost for two',names='City')
figg.show()

Top rated restaurants in India¶

In [48]:
top_rated=india[india['Aggregate rating']>4.5]
In [56]:
figure=px.bar(top_rated,x='City',y='Aggregate rating')
figure.show()

Lowest-rated restaurants in India¶

In [50]:
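# A rating of 0.0 means "Not rated", so exclude it from the low-rated group
low_rated=india[(india['Aggregate rating']>0) & (india['Aggregate rating']<3.4)]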
In [51]:
figure=px.scatter(low_rated,x='City',y='Aggregate rating',color='Rating color')
figure.show()

Ratings Analysis:¶

In [52]:
# Distribution of 'Aggregate rating'
plt.figure(figsize=(10, 6))
sns.histplot(india['Aggregate rating'], bins=30, kde=True, color='skyblue')
plt.title('Distribution of Aggregate Ratings')
plt.xlabel('Aggregate Rating')
plt.ylabel('Frequency')
plt.show()

# Relationship between ratings and average cost
plt.figure(figsize=(10, 6))
sns.scatterplot(x='Aggregate rating', y='Average Cost for two', data=india, color='coral')
plt.title('Scatter Plot: Ratings vs Average Cost for Two')
plt.xlabel('Aggregate Rating')
plt.ylabel('Average Cost for Two')
plt.show()

Voting Analysis:¶

Distribution of 'Votes'¶

In [53]:
votes_stats = india['Votes'].describe()

# Distribution of 'Votes'
plt.figure(figsize=(10, 6))
sns.histplot(india['Votes'], bins=30, kde=True, color='skyblue')
plt.title('Distribution of Votes')
plt.xlabel('Votes')
plt.ylabel('Frequency')
plt.show()

# Scatter plot for 'Votes' against 'Aggregate rating'
plt.figure(figsize=(10, 6))
sns.scatterplot(x='Votes', y='Aggregate rating', data=india, color='coral')
plt.title('Scatter Plot: Votes vs Aggregate Rating')
plt.xlabel('Votes')
plt.ylabel('Aggregate Rating')
plt.show()

# Correlation between 'Votes' and 'Aggregate rating'
correlation_votes_rating = india['Votes'].corr(india['Aggregate rating'])
print(f'Correlation between Votes and Aggregate Rating: {correlation_votes_rating:.2f}')
Correlation between Votes and Aggregate Rating: 0.29
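The 0.29 above is a Pearson correlation, which measures linear association and is sensitive to the long right tail of `Votes`. A sketch of a rank-based (Spearman) alternative, computed here as the Pearson correlation of the ranks so that no extra library is needed. The numbers below are made up.

```python
import pandas as pd

# Made-up values with a monotonic but strongly non-linear relationship
sample = pd.DataFrame({
    "Votes": [5, 30, 130, 900, 10000],
    "Aggregate rating": [2.8, 3.1, 3.6, 4.0, 4.3],
})

pearson = sample["Votes"].corr(sample["Aggregate rating"])

# Spearman correlation = Pearson correlation of the ranks
spearman = sample["Votes"].rank().corr(sample["Aggregate rating"].rank())

print(f"Pearson: {pearson:.2f}, Spearman: {spearman:.2f}")
```

In this toy sample the relationship is perfectly monotonic, so the rank-based value is 1 while the Pearson value is lower.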

Geographic Analysis:¶

Plotting the restaurants on a map using Latitude and Longitude.¶

In [54]:
top_rated = india[india['Aggregate rating'] > 4.5]
fig = px.scatter_mapbox(top_rated, lat="Latitude", lon="Longitude", hover_name="City",
                        hover_data=["Aggregate rating", "Restaurant Name"],
                        color_discrete_sequence=["fuchsia"], zoom=4)
# Single layout call: previously the later autosize/width/height settings overrode the earlier ones
fig.update_layout(mapbox_style="open-street-map",
                  title='Locations of Highly Rated Restaurants',
                  margin={"r":0,"t":40,"l":0,"b":0},  # top margin leaves room for the title
                  autosize=False,
                  width=800,
                  height=500,
                  hovermode='closest',
                  showlegend=False)

fig.show()
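Some rows in the India subset have `Longitude` and `Latitude` both equal to 0.0 (for example index 625 in the output of `india`), which would place a marker in the Gulf of Guinea rather than in India. A sketch of filtering these placeholders out before mapping; `drop_missing_coordinates` is a hypothetical helper, not part of the original notebook.

```python
import pandas as pd

def drop_missing_coordinates(frame: pd.DataFrame) -> pd.DataFrame:
    """Return rows whose coordinates are not the (0.0, 0.0) placeholder."""
    has_location = ~((frame["Latitude"] == 0) & (frame["Longitude"] == 0))
    return frame[has_location]

# Two rows copied from the india output: one with real coordinates, one with zeros
sample = pd.DataFrame({
    "Restaurant Name": ["Jahanpanah", "Rangrezz Restaurant"],
    "Latitude": [27.161661, 0.0],
    "Longitude": [78.011544, 0.0],
})

mappable = drop_missing_coordinates(sample)
print(mappable["Restaurant Name"].tolist())
```

The same helper could be applied to `top_rated` before it is passed to the map.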